A Hybrid Strategy For Regular Grammar Parsing
Authors
Abstract
The paper outlines a hybrid architecture for a partial parser based on regular grammars over XML documents. The parser supports the annotation process in the BulTreeBank project and therefore annotates only the ‘sure’ cases. To maximize the number of analyzed phrases, the parser applies a set of grammars dynamically. Each grammar determines not only the constituent structure (plus some syntactic dependencies internal to the structure), but also a description of the local and global context of the recognized phrase. The grammars available to the parser are arranged in a network. The order in which the grammars are applied depends on the initial ordering in the network and on the descriptions associated with the grammars, so the traversal is not deterministic. Additionally, the application of the grammars can be interleaved with the application of other XML tools such as remove, insert and transform operations. This architecture provides a flexible means of guiding the linguistic analysis so that it uses all the available linguistic knowledge and produces a very accurate partial analysis.
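The network-driven application of grammars described in the abstract can be illustrated with a small sketch. It is not the BulTreeBank implementation: the `Node` and `Grammar` classes, the `apply_grammar` and `parse` functions, the toy NP/PP grammars and the tag names are all hypothetical, and the sketch works on a flat list of labels rather than on an XML document. It only shows the idea that each grammar carries a regular pattern plus a context description, and that the grammars tried next depend on which grammar has just succeeded.

```python
import re
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # POS tag or phrase label
    children: list = field(default_factory=list)


@dataclass
class Grammar:
    name: str                       # label given to each recognized phrase
    pattern: str                    # regex over labels, each followed by one space
    left_context: str = ""          # regex that must end right before the phrase
    right_context: str = ""         # regex that must start right after the phrase
    successors: tuple = ()          # grammars to try once this one has fired


def apply_grammar(nodes, g):
    """Wrap every match of g whose context is satisfied in a new node."""
    text = "".join(n.label + " " for n in nodes)
    offsets, pos = [], 0
    for n in nodes:                 # character offset at which each label starts
        offsets.append(pos)
        pos += len(n.label) + 1
    end_to_index = {off: i for i, off in enumerate(offsets + [pos])}
    regex = re.compile(g.pattern)
    out, i, changed = [], 0, False
    while i < len(nodes):
        m = regex.match(text, offsets[i])
        ok = m is not None and m.end() > m.start() and m.end() in end_to_index
        if ok and g.left_context:
            ok = re.search(f"(?:{g.left_context})$", text[:m.start()]) is not None
        if ok and g.right_context:
            ok = re.match(g.right_context, text[m.end():]) is not None
        if ok:
            j = end_to_index[m.end()]
            out.append(Node(g.name, nodes[i:j]))
            i, changed = j, True
        else:                       # not a 'sure' case: leave the node unannotated
            out.append(nodes[i])
            i += 1
    return out, changed


def parse(nodes, network, initial_order, max_steps=100):
    """Try grammars in the initial order; a grammar that fires enqueues its successors."""
    queue, trace = deque(initial_order), []
    while queue and max_steps > 0:
        max_steps -= 1
        g = network[queue.popleft()]
        nodes, changed = apply_grammar(nodes, g)
        if changed:
            trace.append(g.name)
            queue.extend(g.successors)
    return nodes, trace


# Toy network (assumed for illustration): PP is accepted only at the end of the sentence.
network = {
    "NP": Grammar("NP", r"(?:Det )?(?:Adj )*N ", successors=("PP",)),
    "PP": Grammar("PP", r"Prep NP ", right_context=r"$"),
}
tags = "Det Adj N V Det N Prep Det N".split()
result, trace = parse([Node(t) for t in tags], network, ["PP", "NP"])
print([n.label for n in result], trace)
```

In this toy run the PP grammar is tried first and finds nothing, because no NP exists yet. Once the NP grammar has fired, PP is queued again and succeeds, so the effective order is decided by the intermediate results rather than fixed in advance. Operations such as remove, insert or transform could be modelled as further steps in the same queue.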
Similar resources
A Generalized View on Parsing and Translation
We present a formal framework that generalizes a variety of monolingual and synchronous grammar formalisms for parsing and translation. Our framework is based on regular tree grammars that describe derivation trees, which are interpreted in arbitrary algebras. We obtain generic parsing algorithms by exploiting closure properties of regular tree languages.
Parsing with Pictures
The development of elegant and practical algorithms for parsing context-free languages is one of the major accomplishments of 20th century Computer Science. These algorithms are presented in the literature using string rewriting systems or abstract machines like pushdown automata, but the resulting descriptions are unsatisfactory for several reasons. First, even a basic understanding of parsing a...
Tests for the LR-, LL-, and LC-Regular Conditions
Most of the linear time parsing strategies (e.g., LL(k) and LR(k) type parsers) for context-free grammars operate by looking ahead on the input tape for a fixed number of symbols. The fixed length look-ahead strings partition the set of input strings into classes of strings which are equivalent with respect to parsing decisions. A moment’s thought shows that these look-ahead classes are regular ...
Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing
We investigated the performance efficacy of beam search parsing and deep parsing techniques in probabilistic HPSG parsing using the Penn treebank. We first tested the beam thresholding and iterative parsing developed for PCFG parsing with an HPSG. Next, we tested three techniques originally developed for deep parsing: quick check, large constituent inhibition, and hybrid parsing with a CFG chun...
From LL-Regular to LL (1) Grammars: Transformations, Covers and Parsing
In this paper it is shown that it is possible to transform any LL-regular grammar G into an LL(1) grammar G' in such a way that parsing G' is as good as parsing G. That is, a parse of a sentence of grammar G can be obtained with a simple string homomorphism from the parse of a corresponding sentence of G'. Since any LL(k) grammar is an LL-regular grammar the results which are obtained are v...
Publication date: 2004